{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example - Multi-lingual semantic search"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# LanceDB Embeddings API: Multi-lingual semantic search\n",
    "In this example, we'll build a simple LanceDB table that stores embeddings of text in different languages, so that a single query can search across all of them.\n",
    "* The **dataset** is the Wikipedia dataset in English and French\n",
    "* The **model** is Cohere's multi-lingual embedding model\n",
    "\n",
    "Along the way, we'll explore LanceDB's Embeddings API, which lets you create tables that automatically vectorize data once you define the embedding config at table creation time. Let's dive right in!\n",
    "\n",
    "To learn more about LanceDB, visit [our docs](https://lancedb.github.io/lancedb/)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -qU datasets cohere openai lancedb\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create datasets\n",
    "To access the data, we'll use the Hugging Face `datasets` library in streaming mode. We'll use the English and French versions of Wikipedia and embed them together. For semantic search, the order in which the rows are ingested should not matter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "from datasets import load_dataset\n",
    "\n",
    "# Stream both Wikipedia dumps instead of downloading them in full\n",
    "en = dataset = load_dataset(\"wikipedia\", \"20220301.en\", streaming=True)\n",
    "fr = load_dataset(\"wikipedia\", \"20220301.fr\", streaming=True)\n",
    "\n",
    "# One iterator per language over the `train` split\n",
    "datasets = {\"english\": iter(en['train']), \"french\": iter(fr['train'])}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a look at the format of a single record."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'id': '12',\n",
       " 'url': 'https://en.wikipedia.org/wiki/Anarchism',\n",
       " 'title': 'Anarchism',\n",
       " 'text': 'Anarchism is a political philosophy and movement that is sceptical of authority and rejects all involuntary, coercive forms of hierarchy. Anarchism calls for the abolition of the state, which it holds to be unnecessary, undesirable, and harmful. As a historically left-wing movement, placed on the farthest left of the political spectrum, it is usually described alongside communalism and libertarian Marxism as the libertarian wing (libertarian socialism) of the socialist movement, and has a strong historical association with anti-capitalism and socialism.\\n\\nHumans lived in societies without formal hierarchies long before the establishment of formal states, realms, or empires. With the rise of organised hierarchical bodies, scepticism toward authority also rose. Although traces of anarchist thought are found throughout history, modern anarchism emerged from the Enlightenment. During the latter half of the 19th and the first decades of the 20th century, the anarchist movement flourished in most parts of the world and had a significant role in workers\\' struggles for emancipation. Various anarchist schools of thought formed during this period. Anarchists have taken part in several revolutions, most notably in the Paris Commune, the Russian Civil War and the Spanish Civil War, whose end marked the end of the classical era of anarchism. In the last decades of the 20th and into the 21st century, the anarchist movement has been resurgent once more.\\n\\nAnarchism employs a diversity of tactics in order to meet its ideal ends which can be broadly separated into revolutionary and evolutionary tactics; there is significant overlap between the two, which are merely descriptive. Revolutionary tactics aim to bring down authority and state, having taken a violent turn in the past, while evolutionary tactics aim to prefigure what an anarchist society would be like. Anarchist thought, criticism, and praxis have played a part in diverse areas of human society. 
Criticism of anarchism include claims that it is internally inconsistent, violent, or utopian.\\n\\nEtymology, terminology, and definition \\n\\nThe etymological origin of anarchism is from the Ancient Greek anarkhia, meaning \"without a ruler\", composed of the prefix an- (\"without\") and the word arkhos (\"leader\" or \"ruler\"). The suffix -ism denotes the ideological current that favours anarchy. Anarchism appears in English from 1642 as anarchisme and anarchy from 1539; early English usages emphasised a sense of disorder. Various factions within the French Revolution labelled their opponents as anarchists, although few such accused shared many views with later anarchists. Many revolutionaries of the 19th century such as William Godwin (1756–1836) and Wilhelm Weitling (1808–1871) would contribute to the anarchist doctrines of the next generation but did not use anarchist or anarchism in describing themselves or their beliefs.\\n\\nThe first political philosopher to call himself an anarchist () was Pierre-Joseph Proudhon (1809–1865), marking the formal birth of anarchism in the mid-19th century. Since the 1890s and beginning in France, libertarianism has often been used as a synonym for anarchism and its use as a synonym is still common outside the United States. Some usages of libertarianism refer to individualistic free-market philosophy only, and free-market anarchism in particular is termed libertarian anarchism.\\n\\nWhile the term libertarian has been largely synonymous with anarchism, its meaning has more recently diluted with wider adoption from ideologically disparate groups, including both the New Left and libertarian Marxists, who do not associate themselves with authoritarian socialists or a vanguard party, and extreme cultural liberals, who are primarily concerned with civil liberties. Additionally, some anarchists use libertarian socialist to avoid anarchism\\'s negative connotations and emphasise its connections with socialism. 
Anarchism is broadly used to describe the anti-authoritarian wing of the socialist movement. Anarchism is contrasted to socialist forms which are state-oriented or from above. Scholars of anarchism generally highlight anarchism\\'s socialist credentials and criticise attempts at creating dichotomies between the two. Some scholars describe anarchism as having many influences from liberalism, and being both liberals and socialists but more so, while most scholars reject anarcho-capitalism as a misunderstanding of anarchist principles.\\n\\nWhile opposition to the state is central to anarchist thought, defining anarchism is not an easy task for scholars, as there is a lot of discussion among scholars and anarchists on the matter, and various currents perceive anarchism slightly differently. Major definitional elements include the will for a non-coercive society, the rejection of the state apparatus, the belief that human nature allows humans to exist in or progress toward such a non-coercive society, and a suggestion on how to act to pursue the ideal of anarchy.\\n\\nHistory\\n\\nPre-modern era \\n\\nBefore the establishment of towns and cities, an established authority did not exist. It was after the creation of institutions of authority that anarchistic ideas espoused as a reaction. The most notable precursors to anarchism in the ancient world were in China and Greece. In China, philosophical anarchism (the discussion on the legitimacy of the state) was delineated by Taoist philosophers Zhuang Zhou and Laozi. Alongside Stoicism, Taoism has been said to have had \"significant anticipations\" of anarchism.\\n \\nAnarchic attitudes were also articulated by tragedians and philosophers in Greece. Aeschylus and Sophocles used the myth of Antigone to illustrate the conflict between rules set by the state and personal autonomy. Socrates questioned Athenian authorities constantly and insisted on the right of individual freedom of conscience. 
Cynics dismissed human law (nomos) and associated authorities while trying to live according to nature (physis). Stoics were supportive of a society based on unofficial and friendly relations among its citizens without the presence of a state.\\n\\nIn medieval Europe, there was no anarchistic activity except some ascetic religious movements. These, and other Muslim movements, later gave birth to religious anarchism. In the Sasanian Empire, Mazdak called for an egalitarian society and the abolition of monarchy, only to be soon executed by Emperor Kavad I.\\n\\nIn Basra, religious sects preached against the state. In Europe, various sects developed anti-state and libertarian tendencies. Renewed interest in antiquity during the Renaissance and in private judgment during the Reformation restored elements of anti-authoritarian secularism, particularly in France. Enlightenment challenges to intellectual authority (secular and religious) and the revolutions of the 1790s and 1848 all spurred the ideological development of what became the era of classical anarchism.\\n\\nModern era \\nDuring the French Revolution, partisan groups such as the Enragés and the  saw a turning point in the fermentation of anti-state and federalist sentiments. The first anarchist currents developed throughout the 18th century as William Godwin espoused philosophical anarchism in England, morally delegitimising the state, Max Stirner\\'s thinking paved the way to individualism and Pierre-Joseph Proudhon\\'s theory of mutualism found fertile soil in France. By the late 1870s, various anarchist schools of thought had become well-defined and a wave of then unprecedented globalisation occurred from 1880 to 1914. 
This era of classical anarchism lasted until the end of the Spanish Civil War and is considered the golden age of anarchism.\\n\\nDrawing from mutualism, Mikhail Bakunin founded collectivist anarchism and entered the International Workingmen\\'s Association, a class worker union later known as the First International that formed in 1864 to unite diverse revolutionary currents. The International became a significant political force, with Karl Marx being a leading figure and a member of its General Council. Bakunin\\'s faction (the Jura Federation) and Proudhon\\'s followers (the mutualists) opposed state socialism, advocating political abstentionism and small property holdings. After bitter disputes, the Bakuninists were expelled from the International by the Marxists at the 1872 Hague Congress. Anarchists were treated similarly in the Second International, being ultimately expelled in 1896. Bakunin famously predicted that if revolutionaries gained power by Marx\\'s terms, they would end up the new tyrants of workers. In response to their expulsion from the First International, anarchists formed the St. Imier International. Under the influence of Peter Kropotkin, a Russian philosopher and scientist, anarcho-communism overlapped with collectivism. Anarcho-communists, who drew inspiration from the 1871 Paris Commune, advocated for free federation and for the distribution of goods according to one\\'s needs.\\n\\nAt the turn of the century, anarchism had spread all over the world. It was a notable feature of the international syndicalism movement. In China, small groups of students imported the humanistic pro-science version of anarcho-communism. Tokyo was a hotspot for rebellious youth from countries of the far east, travelling to the Japanese capital to study. In Latin America, Argentina was a stronghold for anarcho-syndicalism, where it became the most prominent left-wing ideology. 
During this time, a minority of anarchists adopted tactics of revolutionary political violence. This strategy became known as propaganda of the deed. The dismemberment of the French socialist movement into many groups and the execution and exile of many Communards to penal colonies following the suppression of the Paris Commune favoured individualist political expression and acts. Even though many anarchists distanced themselves from these terrorist acts, infamy came upon the movement and attempts were made to exclude them from American immigration, including the Immigration Act of 1903, also called the Anarchist Exclusion Act. Illegalism was another strategy which some anarchists adopted during this period.\\n\\nDespite concerns, anarchists enthusiastically participated in the Russian Revolution in opposition to the White movement; however, they met harsh suppression after the Bolshevik government was stabilised. Several anarchists from Petrograd and Moscow fled to Ukraine, notably leading to the Kronstadt rebellion and Nestor Makhno\\'s struggle in the Free Territory. With the anarchists being crushed in Russia, two new antithetical currents emerged, namely platformism and synthesis anarchism. The former sought to create a coherent group that would push for revolution while the latter were against anything that would resemble a political party. Seeing the victories of the Bolsheviks in the October Revolution and the resulting Russian Civil War, many workers and activists turned to communist parties which grew at the expense of anarchism and other socialist movements. In France and the United States, members of major syndicalist movements such as the General Confederation of Labour and the Industrial Workers of the World left their organisations and joined the Communist International.\\n\\nIn the Spanish Civil War of 1936, anarchists and syndicalists (CNT and FAI) once again allied themselves with various currents of leftists. 
A long tradition of Spanish anarchism led to anarchists playing a pivotal role in the war. In response to the army rebellion, an anarchist-inspired movement of peasants and workers, supported by armed militias, took control of Barcelona and of large areas of rural Spain, where they collectivised the land. The Soviet Union provided some limited assistance at the beginning of the war, but the result was a bitter fight among communists and anarchists at a series of events named May Days as Joseph Stalin tried to seize control of the Republicans.\\n\\nPost-war era \\n\\nAt the end of World War II, the anarchist movement was severely weakened. The 1960s witnessed a revival of anarchism, likely caused by a perceived failure of Marxism–Leninism and tensions built by the Cold War. During this time, anarchism found a presence in other movements critical towards both capitalism and the state such as the anti-nuclear, environmental, and peace movements, the counterculture of the 1960s, and the New Left. It also saw a transition from its previous revolutionary nature to provocative anti-capitalist reformism. Anarchism became associated with punk subculture as exemplified by bands such as Crass and the Sex Pistols. The established feminist tendencies of anarcha-feminism returned with vigour during the second wave of feminism. Black anarchism began to take form at this time and influenced anarchism\\'s move from a Eurocentric demographic. This coincided with its failure to gain traction in Northern Europe and its unprecedented height in Latin America.\\n\\nAround the turn of the 21st century, anarchism grew in popularity and influence within anti-capitalist, anti-war and anti-globalisation movements. Anarchists became known for their involvement in protests against the World Trade Organization (WTO), the Group of Eight and the World Economic Forum. 
During the protests, ad hoc leaderless anonymous cadres known as black blocs engaged in rioting, property destruction and violent confrontations with the police. Other organisational tactics pioneered in this time include affinity groups, security culture and the use of decentralised technologies such as the Internet. A significant event of this period was the confrontations at the 1999 Seattle WTO conference. Anarchist ideas have been influential in the development of the Zapatistas in Mexico and the Democratic Federation of Northern Syria, more commonly known as Rojava, a de facto autonomous region in northern Syria.\\n\\nThought \\n\\nAnarchist schools of thought have been generally grouped into two main historical traditions, social anarchism and individualist anarchism, owing to their different origins, values and evolution. The individualist current emphasises negative liberty in opposing restraints upon the free individual, while the social current emphasises positive liberty in aiming to achieve the free potential of society through equality and social ownership. In a chronological sense, anarchism can be segmented by the classical currents of the late 19th century and the post-classical currents (anarcha-feminism, green anarchism, and post-anarchism) developed thereafter.\\n\\nBeyond the specific factions of anarchist movements which constitute political anarchism lies philosophical anarchism which holds that the state lacks moral legitimacy, without necessarily accepting the imperative of revolution to eliminate it. A component especially of individualist anarchism, philosophical anarchism may tolerate the existence of a minimal state but claims that citizens have no moral obligation to obey government when it conflicts with individual autonomy. Anarchism pays significant attention to moral arguments since ethics have a central role in anarchist philosophy. 
Anarchism\\'s emphasis on anti-capitalism, egalitarianism, and for the extension of community and individuality sets it apart from anarcho-capitalism and other types of economic libertarianism.\\n\\nAnarchism is usually placed on the far-left of the political spectrum. Much of its economics and legal philosophy reflect anti-authoritarian, anti-statist, libertarian, and radical interpretations of left-wing and socialist politics such as collectivism, communism, individualism, mutualism, and syndicalism, among other libertarian socialist economic theories. As anarchism does not offer a fixed body of doctrine from a single particular worldview, many anarchist types and traditions exist and varieties of anarchy diverge widely. One reaction against sectarianism within the anarchist milieu was anarchism without adjectives, a call for toleration and unity among anarchists first adopted by Fernando Tarrida del Mármol in 1889 in response to the bitter debates of anarchist theory at the time. Belief in political nihilism has been espoused by anarchists. Despite separation, the various anarchist schools of thought are not seen as distinct entities but rather as tendencies that intermingle and are connected through a set of uniform principles such as individual and local autonomy, mutual aid, network organisation, communal democracy, justified authority and decentralisation.\\n\\nClassical \\n\\nInceptive currents among classical anarchist currents were mutualism and individualism. They were followed by the major currents of social anarchism (collectivist, communist and syndicalist). They differ on organisational and economic aspects of their ideal society.\\n\\nMutualism is an 18th-century economic theory that was developed into anarchist theory by Pierre-Joseph Proudhon. Its aims include reciprocity, free association, voluntary contract, federation and monetary reform of both credit and currency that would be regulated by a bank of the people. 
Mutualism has been retrospectively characterised as ideologically situated between individualist and collectivist forms of anarchism. In What Is Property? (1840), Proudhon first characterised his goal as a \"third form of society, the synthesis of communism and property.\" Collectivist anarchism is a revolutionary socialist form of anarchism commonly associated with Mikhail Bakunin. Collectivist anarchists advocate collective ownership of the means of production which is theorised to be achieved through violent revolution and that workers be paid according to time worked, rather than goods being distributed according to need as in communism. Collectivist anarchism arose alongside Marxism but rejected the dictatorship of the proletariat despite the stated Marxist goal of a collectivist stateless society.\\n\\nAnarcho-communism is a theory of anarchism that advocates a communist society with common ownership of the means of production, direct democracy and a horizontal network of voluntary associations, workers\\' councils and worker cooperatives, with production and consumption based on the guiding principle \"From each according to his ability, to each according to his need.\" Anarcho-communism developed from radical socialist currents after the French Revolution but was first formulated as such in the Italian section of the First International. It was later expanded upon in the theoretical work of Peter Kropotkin, whose specific style would go onto become the dominating view of anarchists by the late 19th century. Anarcho-syndicalism is a branch of anarchism that views labour syndicates as a potential force for revolutionary social change, replacing capitalism and the state with a new society democratically self-managed by workers. 
The basic principles of anarcho-syndicalism are direct action, workers\\' solidarity and workers\\' self-management.\\n\\nIndividualist anarchism is a set of several traditions of thought within the anarchist movement that emphasise the individual and their will over any kinds of external determinants. Early influences on individualist forms of anarchism include William Godwin, Max Stirner, and Henry David Thoreau. Through many countries, individualist anarchism attracted a small yet diverse following of Bohemian artists and intellectuals as well as young anarchist outlaws in what became known as illegalism and individual reclamation.\\n\\nPost-classical and contemporary \\n\\nAnarchist principles undergird contemporary radical social movements of the left. Interest in the anarchist movement developed alongside momentum in the anti-globalisation movement, whose leading activist networks were anarchist in orientation. As the movement shaped 21st century radicalism, wider embrace of anarchist principles signaled a revival of interest. Anarchism has continued to generate many philosophies and movements, at times eclectic, drawing upon various sources and combining disparate concepts to create new philosophical approaches. The anti-capitalist tradition of classical anarchism has remained prominent within contemporary currents.\\n\\nContemporary news coverage which emphasizes black bloc demonstrations has reinforced anarchism\\'s historical association with chaos and violence. Its publicity has also led more scholars in fields such as anthropology and history to engage with the anarchist movement, although contemporary anarchism favours actions over academic theory. Various anarchist groups, tendencies, and schools of thought exist today, making it difficult to describe the contemporary anarchist movement. 
While theorists and activists have established \"relatively stable constellations of anarchist principles\", there is no consensus on which principles are core and commentators describe multiple anarchisms, rather than a singular anarchism, in which common principles are shared between schools of anarchism while each group prioritizes those principles differently. Gender equality can be a common principle, although it ranks as a higher priority to anarcha-feminists than anarcho-communists.\\n\\nAnarchists are generally committed against coercive authority in all forms, namely \"all centralized and hierarchical forms of government (e.g., monarchy, representative democracy, state socialism, etc.), economic class systems (e.g., capitalism, Bolshevism, feudalism, slavery, etc.), autocratic religions (e.g., fundamentalist Islam, Roman Catholicism, etc.), patriarchy, heterosexism, white supremacy, and imperialism.\" Anarchist schools disagree on the methods by which these forms should be opposed. The principle of equal liberty is closer to anarchist political ethics in that it transcends both the liberal and socialist traditions. This entails that liberty and equality cannot be implemented within the state, resulting in the questioning of all forms of domination and hierarchy.\\n\\nTactics \\nAnarchists\\' tactics take various forms but in general serve two major goals, namely to first oppose the Establishment and secondly to promote anarchist ethics and reflect an anarchist vision of society, illustrating the unity of means and ends. A broad categorisation can be made between aims to destroy oppressive states and institutions by revolutionary means on one hand and aims to change society through evolutionary means on the other. Evolutionary tactics embrace nonviolence, reject violence and take a gradual approach to anarchist aims, although there is significant overlap between the two.\\n\\nAnarchist tactics have shifted during the course of the last century. 
Anarchists during the early 20th century focused more on strikes and militancy while contemporary anarchists use a broader array of approaches.\\n\\nClassical era tactics \\n\\nDuring the classical era, anarchists had a militant tendency. Not only did they confront state armed forces, as in Spain and Ukraine, but some of them also employed terrorism as propaganda of the deed. Assassination attempts were carried out against heads of state, some of which were successful. Anarchists also took part in revolutions. Many anarchists, especially the Galleanists, believed that these attempts would be the impetus for a revolution against capitalism and the state. Many of these attacks were done by individual assailants and the majority took place in the late 1870s, the early 1880s and the 1890s, with some still occurring in the early 1900s. Their decrease in prevalence was the result of further judicial power and targeting and cataloging by state institutions.\\n\\nAnarchist perspectives towards violence have always been controversial. Anarcho-pacifists advocate for non-violence means to achieve their stateless, nonviolent ends. Other anarchist groups advocate direct action, a tactic which can include acts of sabotage or terrorism. This attitude was quite prominent a century ago when seeing the state as a tyrant and some anarchists believing that they had every right to oppose its oppression by any means possible. Emma Goldman and Errico Malatesta, who were proponents of limited use of violence, stated that violence is merely a reaction to state violence as a necessary evil.\\n\\nAnarchists took an active role in strike actions, although they tended to be antipathetic to formal syndicalism, seeing it as reformist. They saw it as a part of the movement which sought to overthrow the state and capitalism. Anarchists also reinforced their propaganda within the arts, some of whom practiced naturism and nudism. 
Those anarchists also built communities which were based on friendship and were involved in the news media.\\n\\nRevolutionary tactics \\n\\nIn the current era, Italian anarchist Alfredo Bonanno, a proponent of insurrectionary anarchism, has reinstated the debate on violence by rejecting the nonviolence tactic adopted since the late 19th century by Kropotkin and other prominent anarchists afterwards. Both Bonanno and the French group The Invisible Committee advocate for small, informal affiliation groups, where each member is responsible for their own actions but works together to bring down oppression utilizing sabotage and other violent means against state, capitalism, and other enemies. Members of The Invisible Committee were arrested in 2008 on various charges, terrorism included.\\n\\nOverall, contemporary anarchists are much less violent and militant than their ideological ancestors. They mostly engage in confronting the police during demonstrations and riots, especially in countries such as Canada, Greece, and Mexico. Militant black bloc protest groups are known for clashing with the police; however, anarchists not only clash with state operators, they also engage in the struggle against fascists and racists, taking anti-fascist action and mobilizing to prevent hate rallies from happening.\\n\\nEvolutionary tactics \\nAnarchists commonly employ direct action. This can take the form of disrupting and protesting against unjust hierarchy, or the form of self-managing their lives through the creation of counter-institutions such as communes and non-hierarchical collectives. Decision-making is often handled in an anti-authoritarian way, with everyone having equal say in each decision, an approach known as horizontalism. 
Contemporary-era anarchists have been engaging with various grassroots movements that are more or less based on horizontalism, although not explicitly anarchist, respecting personal autonomy and participating in mass activism such as strikes and demonstrations. In contrast with the big-A anarchism of the classical era, the newly coined term small-a anarchism signals their tendency not to base their thoughts and actions on classical-era anarchism or to refer to classical anarchists such as Peter Kropotkin and Pierre-Joseph Proudhon to justify their opinions. Those anarchists would rather base their thought and praxis on their own experience which they will later theorize.\\n\\nThe decision-making process of small anarchist affinity groups plays a significant tactical role. Anarchists have employed various methods in order to build a rough consensus among members of their group without the need of a leader or a leading group. One way is for an individual from the group to play the role of facilitator to help achieve a consensus without taking part in the discussion themselves or promoting a specific point. Minorities usually accept rough consensus, except when they feel the proposal contradicts anarchist ethics, goals and values. Anarchists usually form small groups (5–20 individuals) to enhance autonomy and friendships among their members. These kinds of groups more often than not interconnect with each other, forming larger networks. Anarchists still support and participate in strikes, especially wildcat strikes as these are leaderless strikes not organised centrally by a syndicate.\\n\\nAs in the past, newspapers and journals are used, and anarchists have gone online in the World Wide Web to spread their message. Anarchists have found it easier to create websites because of distributional and other difficulties, hosting electronic libraries and other portals. Anarchists were also involved in developing various software that are available for free. 
The way these hacktivists work to develop and distribute resembles the anarchist ideals, especially when it comes to preserving users\\' privacy from state surveillance.\\n\\nAnarchists organize themselves to squat and reclaim public spaces. During important events such as protests and when spaces are being occupied, they are often called Temporary Autonomous Zones (TAZ), spaces where art, poetry, and surrealism are blended to display the anarchist ideal. As seen by anarchists, squatting is a way to regain urban space from the capitalist market, serving pragmatical needs and also being an exemplary direct action. Acquiring space enables anarchists to experiment with their ideas and build social bonds. Adding up these tactics while having in mind that not all anarchists share the same attitudes towards them, along with various forms of protesting at highly symbolic events, make up a carnivalesque atmosphere that is part of contemporary anarchist vividity.\\n\\nKey issues \\n\\nAs anarchism is a philosophy that embodies many diverse attitudes, tendencies, and schools of thought; disagreement over questions of values, ideology, and tactics is common. Its diversity has led to widely different uses of identical terms among different anarchist traditions which has created a number of definitional concerns in anarchist theory. The compatibility of capitalism, nationalism, and religion with anarchism is widely disputed, and anarchism enjoys complex relationships with ideologies such as communism, collectivism, Marxism, and trade unionism. Anarchists may be motivated by humanism, divine authority, enlightened self-interest, veganism, or any number of alternative ethical doctrines. Phenomena such as civilisation, technology (e.g. 
within anarcho-primitivism), and the democratic process may be sharply criticised within some anarchist tendencies and simultaneously lauded in others.\\n\\nGender, sexuality, and free love \\n\\nAs gender and sexuality carry along them dynamics of hierarchy, many anarchists address, analyse, and oppose the suppression of one\\'s autonomy imposed by gender roles.\\n\\nSexuality was not often discussed by classical anarchists but the few that did felt that an anarchist society would lead to sexuality naturally developing. Sexual violence was a concern for anarchists such as Benjamin Tucker, who opposed age of consent laws, believing they would benefit predatory men. A historical current that arose and flourished during 1890 and 1920 within anarchism was free love. In contemporary anarchism, this current survives as a tendency to support polyamory and queer anarchism. Free love advocates were against marriage, which they saw as a way of men imposing authority over women, largely because marriage law greatly favoured the power of men. The notion of free love was much broader and included a critique of the established order that limited women\\'s sexual freedom and pleasure. Those free love movements contributed to the establishment of communal houses, where large groups of travelers, anarchists and other activists slept in beds together. Free love had roots both in Europe and the United States; however, some anarchists struggled with the jealousy that arose from free love. Anarchist feminists were advocates of free love, against marriage, and pro-choice (utilising a contemporary term), and had a similar agenda. Anarchist and non-anarchist feminists differed on suffrage but were supportive of one another.\\n\\nDuring the second half of the 20th century, anarchism intermingled with the second wave of feminism, radicalising some currents of the feminist movement and being influenced as well. 
By the latest decades of the 20th century, anarchists and feminists were advocating for the rights and autonomy of women, gays, queers and other marginalised groups, with some feminist thinkers suggesting a fusion of the two currents. With the third wave of feminism, sexual identity and compulsory heterosexuality became a subject of study for anarchists, yielding a post-structuralist critique of sexual normality. Some anarchists distanced themselves from this line of thinking, suggesting that it leaned towards an individualism that was dropping the cause of social liberation.\\n\\nAnarchism and education \\n\\nThe interest of anarchists in education stretches back to the first emergence of classical anarchism. Anarchists consider proper education, one which sets the foundations of the future autonomy of the individual and the society, to be an act of mutual aid. Anarchist writers such as William Godwin (Political Justice) and Max Stirner (\"The False Principle of Our Education\") attacked both state education and private education as another means by which the ruling class replicate their privileges.\\n\\nIn 1901, Catalan anarchist and free thinker Francisco Ferrer established the Escuela Moderna in Barcelona as an opposition to the established education system which was dictated largely by the Catholic Church. Ferrer\\'s approach was secular, rejecting both state and church involvement in the educational process whilst giving pupils large amounts of autonomy in planning their work and attendance. Ferrer aimed to educate the working class and explicitly sought to foster class consciousness among students. The school closed after constant harassment by the state and Ferrer was later arrested. Nonetheless, his ideas formed the inspiration for a series of modern schools around the world. 
Christian anarchist Leo Tolstoy, who published the essay Education and Culture, also established a similar school with its founding principle being that \"for education to be effective it had to be free.\" In a similar token, A. S. Neill founded what became the Summerhill School in 1921, also declaring being free from coercion.\\n\\nAnarchist education is based largely on the idea that a child\\'s right to develop freely and without manipulation ought to be respected and that rationality would lead children to morally good conclusions; however, there has been little consensus among anarchist figures as to what constitutes manipulation. Ferrer believed that moral indoctrination was necessary and explicitly taught pupils that equality, liberty and social justice were not possible under capitalism, along with other critiques of government and nationalism.\\n\\nLate 20th century and contemporary anarchist writers (Paul Goodman, Herbert Read, and Colin Ward) intensified and expanded the anarchist critique of state education, largely focusing on the need for a system that focuses on children\\'s creativity rather than on their ability to attain a career or participate in consumerism as part of a consumer society. Contemporary anarchists such as Ward claim that state education serves to perpetuate socioeconomic inequality.\\n\\nWhile few anarchist education institutions have survived to the modern-day, major tenets of anarchist schools, among them respect for child autonomy and relying on reasoning rather than indoctrination as a teaching method, have spread among mainstream educational institutions. 
Judith Suissa names three schools as explicitly anarchists schools, namely the Free Skool Santa Cruz in the United States which is part of a wider American-Canadian network of schools, the Self-Managed Learning College in Brighton, England, and the Paideia School in Spain.\\n\\nAnarchism and the state \\n\\nObjection to the state and its institutions is a sine qua non of anarchism. Anarchists consider the state as a tool of domination and believe it to be illegitimate regardless of its political tendencies. Instead of people being able to control the aspects of their life, major decisions are taken by a small elite. Authority ultimately rests solely on power, regardless of whether that power is open or transparent, as it still has the ability to coerce people. Another anarchist argument against states is that the people constituting a government, even the most altruistic among officials, will unavoidably seek to gain more power, leading to corruption. Anarchists consider the idea that the state is the collective will of the people to be an unachievable fiction due to the fact that the ruling class is distinct from the rest of society.\\n\\nSpecific anarchist attitudes towards the state vary. Robert Paul Wolff believed that the tension between authority and autonomy would mean the state could never be legitimate. Bakunin saw the state as meaning \"coercion, domination by means of coercion, camouflaged if possible but unceremonious and overt if need be.\" A. John Simmons and Leslie Green, who leaned toward philosophical anarchism, believed that the state could be legitimate if it is governed by consensus, although they saw this as highly unlikely. Beliefs on how to abolish the state also differ.\\n\\nAnarchism and the arts \\n\\nThe connection between anarchism and art was quite profound during the classical era of anarchism, especially among artistic currents that were developing during that era such as futurists, surrealists and others. 
In literature, anarchism was mostly associated with the New Apocalyptics and the neo-romanticism movement. In music, anarchism has been associated with music scenes such as punk. Anarchists such as Leo Tolstoy and Herbert Read stated that the border between the artist and the non-artist, what separates art from a daily act, is a construct produced by the alienation caused by capitalism and it prevents humans from living a joyful life.\\n\\nOther anarchists advocated for or used art as a means to achieve anarchist ends. In his book Breaking the Spell: A History of Anarchist Filmmakers, Videotape Guerrillas, and Digital Ninjas, Chris Robé claims that \"anarchist-inflected practices have increasingly structured movement-based video activism.\" Throughout the 20th century, many prominent anarchists (Peter Kropotkin, Emma Goldman, Gustav Landauer and Camillo Berneri) and publications such as Anarchy wrote about matters pertaining to the arts.\\n\\nThree overlapping properties made art useful to anarchists. It could depict a critique of existing society and hierarchies, serve as a prefigurative tool to reflect the anarchist ideal society and even turn into a means of direct action such as in protests. As it appeals to both emotion and reason, art could appeal to the whole human and have a powerful effect. The 19th-century neo-impressionist movement had an ecological aesthetic and offered an example of an anarchist perception of the road towards socialism. In Les chataigniers a Osny by anarchist painter Camille Pissarro, the blending of aesthetic and social harmony is prefiguring an ideal anarchistic agrarian community.\\n\\nAnalysis \\nThe most common critique of anarchism is that humans cannot self-govern and so a state is necessary for human survival. 
Philosopher Bertrand Russell supported this critique, stating that \"[p]eace and war, tariffs, regulations of sanitary conditions and the sale of noxious drugs, the preservation of a just system of distribution: these, among others, are functions which could hardly be performed in a community in which there was no central government.\" Another common criticism of anarchism is that it fits a world of isolation in which only the small enough entities can be self-governing; a response would be that major anarchist thinkers advocated anarchist federalism.\\n\\nPhilosophy lecturer Andrew G. Fiala composed a list of common arguments against anarchism which includes critiques such as that anarchism is innately related to violence and destruction, not only in the pragmatic world, such as at protests, but in the world of ethics as well. Secondly, anarchism is evaluated as unfeasible or utopian since the state cannot be defeated practically. This line of arguments most often calls for political action within the system to reform it. The third argument is that anarchism is self-contradictory. While it advocates for no-one to archiei, if accepted by the many, then anarchism would turn into the ruling political theory. In this line of criticism also comes the self-contradiction that anarchism calls for collective action whilst endorsing the autonomy of the individual, hence no collective action can be taken. Lastly, Fiala mentions a critique towards philosophical anarchism of being ineffective (all talk and thoughts) and in the meantime capitalism and bourgeois class remains strong.\\n\\nPhilosophical anarchism has met the criticism of members of academia following the release of pro-anarchist books such as A. John Simmons\\' Moral Principles and Political Obligations. Law professor William A. Edmundson authored an essay to argue against three major philosophical anarchist principles which he finds fallacious. 
Edmundson says that while the individual does not owe the state a duty of obedience, this does not imply that anarchism is the inevitable conclusion and the state is still morally legitimate. In The Problem of Political Authority, Michael Huemer defends philosophical anarchism, claiming that \"political authority is a moral illusion.\"\\n\\nOne of the earliest criticisms is that anarchism defies and fails to understand the biological inclination to authority. Joseph Raz states that the acceptance of authority implies the belief that following their instructions will afford more success. Raz believes that this argument is true in following both authorities\\' successful and mistaken instruction. Anarchists reject this criticism because challenging or disobeying authority does not entail the disappearance of its advantages by acknowledging authority such as doctors or lawyers as reliable, nor does it involve a complete surrender of independent judgment. Anarchist perception of human nature, rejection of the state, and commitment to social revolution has been criticised by academics as naive, overly simplistic, and unrealistic, respectively. Classical anarchism has been criticised for relying too heavily on the belief that the abolition of the state will lead to human cooperation prospering.\\n\\nFriedrich Engels, considered to be one of the principal founders of Marxism, criticised anarchism\\'s anti-authoritarianism as inherently counter-revolutionary because in his view a revolution is by itself authoritarian. Academic John Molyneux writes in his book Anarchism: A Marxist Criticism that \"anarchism cannot win\", believing that it lacks the ability to properly implement its ideas. The Marxist criticism of anarchism is that it has a utopian character because all individuals should have anarchist views and values. 
According to the Marxist view, that a social idea would follow directly from this human ideal and out of the free will of every individual formed its essence. Marxists state that this contradiction was responsible for their inability to act. In the anarchist vision, the conflict between liberty and equality was resolved through coexistence and intertwining.\\n\\nSee also \\n\\n Anarchism by country\\n Governance without government\\n List of anarchist political ideologies\\n List of books about anarchism\\n\\nReferences\\n\\nCitations\\n\\nNotes\\n\\nSources\\n\\nPrimary sources\\n\\nSecondary sources\\n\\nTertiary sources\\n\\nFurther reading \\n \\n  Criticism of philosophical anarchism.\\n \\n  A defence of philosophical anarchism, stating that \"both kinds of \\'anarchism\\' [i.e. philosophical and political anarchism] are philosophical and political claims.\" (p.\\xa0137)\\n  Anarchistic popular fiction novel.\\n \\n \\n \\n  An argument for philosophical anarchism.\\n\\nExternal links \\n Anarchy Archives. Anarchy Archives is an online research center on the history and theory of anarchism.\\n\\n \\nAnti-capitalism\\nAnti-fascism\\nEconomic ideologies\\nLeft-wing politics\\nLibertarian socialism\\nLibertarianism\\nPolitical culture\\nPolitical movements\\nPolitical ideologies\\nSocial theories\\nSocialism\\nFar-left politics'}"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "next(iter(en['train']))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'id': '3',\n",
       " 'url': 'https://fr.wikipedia.org/wiki/Antoine%20Meillet',\n",
       " 'title': 'Antoine Meillet',\n",
       " 'text': \"Paul Jules Antoine Meillet, né le  à Moulins (Allier) et mort le  à Châteaumeillant (Cher), est le principal linguiste français des premières décennies du . Il est aussi philologue.\\n\\nBiographie \\nD'origine bourbonnaise, fils d'un notaire de Châteaumeillant (Cher), Antoine Meillet fait ses études secondaires au lycée de Moulins.\\n\\nÉtudiant à la faculté des lettres de Paris à partir de 1885 où il suit notamment les cours de Louis Havet, il assiste également à ceux de Michel Bréal au Collège de France et de Ferdinand de Saussure à l'École pratique des hautes études.\\n\\nEn 1889, il est major de l'agrégation de grammaire.\\n\\nIl assure à la suite de Saussure le cours de grammaire comparée, qu'il complète à partir de 1894 par une conférence sur les langues persanes.\\n\\nEn 1897, il soutient sa thèse pour le doctorat ès lettres (Recherches sur l'emploi du génitif-accusatif en vieux-slave). En 1905, il occupe la chaire de grammaire comparée au Collège de France, où il consacre ses cours à l'histoire et à la structure des langues indo-européennes. Il succéda au linguiste Auguste Carrière à la tête de la chaire d'arménien à l'École des langues orientales.\\n\\nSecrétaire de la Société de linguistique de Paris, il est élu à l'Académie des inscriptions et belles-lettres en 1924. Il préside également l'Institut d'Études Slaves de 1921 à sa mort.\\n\\nIl a formé toute une génération de linguistes français, parmi lesquels Émile Benveniste, Marcel Cohen, Georges Dumézil, André Martinet, Aurélien Sauvageot, Lucien Tesnière, Joseph Vendryes, ainsi que le japonisant Charles Haguenauer. Antoine Meillet devait diriger la thèse de Jean Paulhan sur la sémantique du proverbe et c'est lui qui découvrit Gustave Guillaume.\\n\\nIl a influencé aussi un certain nombre de linguistes étrangers. Il a également été le premier à identifier le phénomène de la grammaticalisation.\\n\\nSelon le linguiste allemand Walter Porzig, Meillet est un « grand précurseur ». 
Il montre, par exemple, que, dans les dialectes indo-européens, les groupes indo-européens sont le résultat historique d'une variation diatopique.\\n\\nL’acte de naissance de la sociolinguistique est signé par Antoine Meillet fondateur de la sociolinguistique qui s’est opposé au Cours de linguistique générale de Ferdinand de Saussure dès son apparition en 1916 en le critiquant sur plusieurs plans.\\n\\nÉtudes arméniennes \\n 1890 : une mission de trois mois dans le Caucase lui permet d'apprendre l'arménien moderne.\\n 1902 : il obtient la chaire d'arménien de l'École des langues orientales.\\n 1903 : nouvelle mission en Arménie russe, il publie son Esquisse d'une grammaire comparée de l'arménien classique, qui demeure une référence en linguistique arménienne et indo-européenne jusqu'à ce jour. L'un de ses étudiants, Hratchia Adjarian, devient le fondateur de la dialectologie arménienne. C'est également sous les encouragements de Meillet qu'Émile Benveniste étudie la langue arménienne.\\n 1919 : il est cofondateur de la Société des études arméniennes avec Victor Bérard, Charles Diehl, André-Ferdinand Hérold, H. Lacroix, Frédéric Macler, Gabriel Millet, Gustave Schlumberger.\\n 1920 : le , il crée la Revue des études arméniennes avec Frédéric Macler.\\n\\nÉtudes homériques \\nÀ la Sorbonne, Meillet supervise le travail de Milman Parry. Meillet offre à son étudiant l'opinion, nouvelle à cette époque, que la structure formulaïque de l'Iliade serait une conséquence directe de sa transmission orale. Ainsi, il le dirige vers l'étude de l'oralité dans son cadre natif et lui suggère d'observer les mécanismes d'une tradition orale vivante à côté du texte classique (l'Iliade) qui est censé résulter d'une telle tradition. En conséquence, Meillet présente Parry à Matija Murko, savant originaire de Slovénie qui avait longuement écrit sur la tradition héroïque épique dans les Balkans, surtout en Bosnie-Herzégovine. 
Par leurs recherches, dont les résultats sont à présent hébergés par l'université de Harvard, Parry et son élève, Albert Lord, ont profondément renouvelé les études homériques.\\n\\nPrincipaux ouvrages \\n Études sur l'étymologie et le vocabulaire du vieux slave. Paris, Bouillon, 1902-05.\\n Esquisse d'une grammaire comparée de l'arménien classique, 1903.\\n Introduction à l'étude comparative des langues indo-européennes, 1903 ( éd.), Hachette, Paris, 1912 ( éd.).\\n Les dialectes indo-européens, 1908.\\n Aperçu d'une histoire de la langue grecque, 1913.\\n Altarmenisches Elementarbuch, 1913. Heidelberg (en français : Manuel élémentaire d'Arménien classique, traduction de Gabriel Képéklian, Limoges, Lambert-Lucas, 2017 )\\n Caractères généraux des langues germaniques, 1917, rev. edn. 1949.\\n Linguistique historique et linguistique générale, 1921 (le tome II est paru en 1936 ; les deux tomes ont été réunis chez Lambert-Lucas, Limoges, 2015).\\n Les origines indo-européennes des mètres grecs, 1923.\\n Traité de grammaire comparée des langues classiques, 1924 (avec Joseph Vendryés).\\n La méthode comparative en linguistique historique, 1925, Oslo, Instituttet for Sammenlignende Kulturforskning (réimpr. Paris, Champion, 1954).\\n .\\n Dictionnaire étymologique de la langue latine, 1932 (en collab. Avec Alfred Ernout (1879-1973), éd. augmentée, par Jacques André (1910-1994), Paris : Klincksieck, 2001,  \\n Meillet en Arménie, 1891, 1903, Journaux et lettres publiés par Francis Gandon, Limoges, Lambert-Lucas, 2014, .\\n\\nNotes et références\\n\\nVoir aussi\\n\\nBibliographie \\n Marc Décimo, Sciences et pataphysique, t. 2 : Comment la linguistique vint à Paris ?, De Michel Bréal à Ferdinand de Saussure, Dijon, Les Presses du réel, coll. 
Les Hétéroclites, 2014 .\\n\\nArticles connexes \\n Franz Bopp\\n Johann Kaspar Zeuss\\n\\nLiens externes \\n \\n \\n \\n\\nCommandeur de la Légion d'honneur\\nAcadémie des inscriptions et belles-lettres\\nAgrégé de grammaire\\nLinguiste français\\nPhilologue français\\nSlaviste\\nPersonnalité liée à la langue kurde\\nInstitut national des langues et civilisations orientales\\nArménologue français\\nIndo-européaniste\\nÉtudiant de l'université de Paris\\nNaissance en novembre 1866\\nNaissance à Moulins (Allier)\\nDécès en septembre 1936\\nDécès à 69 ans\\nDécès dans le Cher\\nPersonnalité inhumée à Moulins\"}"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "next(iter(fr['train']))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LanceDB Embeddings API\n",
    "Let's see how you can use the embeddings API to create an ingestion pipeline that automatically does all the vectorization for you both when ingesting new data or searching queries."
   ]
  },
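  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the idea concrete, here is a minimal, self-contained sketch of what \"automatic vectorization\" means: a table that embeds the source field on ingest and embeds the query on search. `toy_embed` and `ToyTable` are hypothetical names invented for this illustration, and the hashing-based embedding is only a stand-in for a real model; this is not how LanceDB is implemented.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def toy_embed(text, ndims=64):\n",
    "    # Stand-in for a real embedding model: hash each token into a bucket, then L2-normalize.\n",
    "    vec = [0.0] * ndims\n",
    "    for token in text.lower().split():\n",
    "        vec[sum(token.encode('utf-8')) % ndims] += 1.0\n",
    "    norm = math.sqrt(sum(v * v for v in vec)) or 1.0\n",
    "    return [v / norm for v in vec]\n",
    "\n",
    "class ToyTable:\n",
    "    # Hypothetical table that vectorizes rows on add() and queries on search().\n",
    "    def __init__(self, embed):\n",
    "        self.embed = embed\n",
    "        self.rows = []\n",
    "\n",
    "    def add(self, rows):\n",
    "        # The caller passes raw text only; the vector column is filled in here.\n",
    "        for row in rows:\n",
    "            self.rows.append({**row, 'vector': self.embed(row['text'])})\n",
    "\n",
    "    def search(self, query, limit=3):\n",
    "        # Embed the query with the same function, then rank rows by cosine similarity\n",
    "        # (a plain dot product, since all vectors are unit length).\n",
    "        q = self.embed(query)\n",
    "        scored = [(sum(a * b for a, b in zip(q, r['vector'])), r) for r in self.rows]\n",
    "        scored.sort(key=lambda pair: pair[0], reverse=True)\n",
    "        return [r for _, r in scored[:limit]]\n",
    "\n",
    "toy = ToyTable(toy_embed)\n",
    "toy.add([\n",
    "    {'id': 'a', 'text': 'irish republican army'},\n",
    "    {'id': 'b', 'text': 'french linguist meillet'},\n",
    "])\n",
    "hits = toy.search('republican army', limit=1)\n",
    "```\n",
    "\n",
    "With the real API, the embedding function attached to the schema plays the role of `toy_embed`, so callers only ever pass raw text."
   ]
  },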
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### OpenAI API Example\n",
    "Let us take a look at openAI example first. LanceDB comes with OpenAI embedding function support.\n",
    "* Create the instance of the available embedding function or create your own\n",
    "* Create the scheme of the table, marking source end vector fields. Each embedding function can have multiple source and vector feilds\n",
    "* Create a table with schema\n",
    "\n",
    "Doing this creates a table with where embedding function information is ingested as metadata so you can forget about all the modelling details and focus only ingesting and retrieving data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import lancedb\n",
    "import getpass\n",
    "from lancedb.embeddings import EmbeddingFunctionRegistry\n",
    "from lancedb.pydantic import LanceModel, Vector\n",
    "\n",
    "if \"OPENAI_API_KEY\" not in os.environ:\n",
    "    os.environ['OPENAI_API_KEY'] = getpass.getpass(\"Enter your OpenAI API key: \")\n",
    "    \n",
    "registry = EmbeddingFunctionRegistry().get_instance()\n",
    "openai = registry.get(\"openai\").create() # uses multi-lingual model by default (768 dim)\n",
    "\n",
    "class Schema(LanceModel):\n",
    "    vector: Vector(openai.ndims()) = openai.VectorField()\n",
    "    text: str = openai.SourceField()\n",
    "    url: str\n",
    "    title: str\n",
    "    id: str\n",
    "    lang: str\n",
    "\n",
    "db = lancedb.connect(\"~/lancedb\")\n",
    "tbl_openai = db.create_table(\"wikipedia-openai\", schema=Schema, mode=\"overwrite\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Cohere Embedding Table\n",
    "Now let's see another example using cohere embedding function which is also supported directly by LanceDB. We will follow the same steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import lancedb\n",
    "import getpass\n",
    "from lancedb.embeddings import EmbeddingFunctionRegistry\n",
    "from lancedb.pydantic import LanceModel, Vector\n",
    "\n",
    "if \"COHERE_API_KEY\" not in os.environ:\n",
    "    os.environ['COHERE_API_KEY'] = getpass.getpass(\"Enter your Cohere API key: \")\n",
    "    \n",
    "registry = EmbeddingFunctionRegistry().get_instance()\n",
    "cohere = registry.get(\"cohere\").create() # uses multi-lingual model by default (768 dim)\n",
    "\n",
    "class Schema(LanceModel):\n",
    "    vector: Vector(cohere.ndims()) = cohere.VectorField()\n",
    "    text: str = cohere.SourceField()\n",
    "    url: str\n",
    "    title: str\n",
    "    id: str\n",
    "    lang: str\n",
    "\n",
    "db = lancedb.connect(\"~/lancedb\")\n",
    "tbl_cohere = db.create_table(\"wikipedia-cohere\", schema=Schema, mode=\"overwrite\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Ingest data\n",
    "Now, we have the table set up for ingesting the dataset. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 10/10 [06:10<00:00, 37.08s/it]\n"
     ]
    }
   ],
   "source": [
    "from tqdm.auto import tqdm\n",
    "import time\n",
    "# let's use cohere embeddings. Use can also set it to openai version of the table\n",
    "tbl = tbl_cohere\n",
    "batch_size = 1000\n",
    "num_records = 10000\n",
    "data = []\n",
    "\n",
    "for i in tqdm(range(0, num_records, batch_size)):\n",
    "\n",
    "    for lang, dataset in datasets.items():\n",
    "        \n",
    "        batch = [next(dataset) for _ in range(batch_size)]\n",
    "        \n",
    "        texts = [x['text'] for x in batch]\n",
    "        ids = [f\"{x['id']}-{lang}\" for x in batch]\n",
    "        data.extend({\n",
    "           'text': x['text'], 'title': x['title'], 'url': x['url'], 'lang': lang, 'id': f\"{lang}-{x['id']}\"\n",
    "        } for x in batch)\n",
    "\n",
    "    # add in batches to avoid token limit\n",
    "    tbl.add(data)\n",
    "    data = []\n",
    "    time.sleep(20) # wait for 20 seconds to avoid rate limit"
   ]
  },
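  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The loop above draws each batch with repeated `next(dataset)` calls, which raises `StopIteration` if a stream runs out before `batch_size` records are read. As a hedged alternative, the hypothetical helper `batched` below uses `itertools.islice` to yield whatever is left instead, and `to_rows` mirrors the row layout used above. Both names are illustrative and are not part of LanceDB or the `datasets` library.\n",
    "\n",
    "```python\n",
    "from itertools import islice\n",
    "\n",
    "def batched(iterable, batch_size):\n",
    "    # Yield lists of up to batch_size items until the iterable is exhausted.\n",
    "    it = iter(iterable)\n",
    "    while True:\n",
    "        batch = list(islice(it, batch_size))\n",
    "        if not batch:\n",
    "            return\n",
    "        yield batch\n",
    "\n",
    "def to_rows(batch, lang):\n",
    "    # Same row layout as the ingestion loop: raw text plus metadata, with a language-prefixed id.\n",
    "    return [\n",
    "        {'text': x['text'], 'title': x['title'], 'url': x['url'], 'lang': lang, 'id': f\"{lang}-{x['id']}\"}\n",
    "        for x in batch\n",
    "    ]\n",
    "\n",
    "# Example with an in-memory stand-in for a streaming dataset.\n",
    "records = [{'id': str(i), 'text': f'text {i}', 'title': f'title {i}', 'url': f'url {i}'} for i in range(5)]\n",
    "batches = [to_rows(b, 'fr') for b in batched(records, 2)]\n",
    "```\n",
    "\n",
    "Each element of `batches` could then be passed to `tbl.add(...)`, with the final batch simply being shorter than the rest."
   ]
  },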
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Searching multi-lingual embedding space\n",
    "Let us now search the table with a substring from a random batch in french"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'id': '12',\n",
       " 'url': 'https://fr.wikipedia.org/wiki/Arm%C3%A9e%20r%C3%A9publicaine%20irlandaise',\n",
       " 'title': 'Armée républicaine irlandaise',\n",
       " 'text': \"L'Armée républicaine irlandaise (, IRA ; ) est le nom porté, depuis le début du , par plusieurs organisations paramilitaires luttant par les armes contre la présence britannique en Irlande du Nord. Les différents groupes se référent à eux comme Óglaigh na hÉireann (« volontaires d'Irlande »).\\n\\n L' appelée aussi Old IRA, issue de l'union en 1916 entre l' (proche du Parti travailliste irlandais) et les Irish Volunteers (alors généralement proches de l'IRB), est active entre  et , pendant la guerre d'indépendance irlandaise. Si ceux qui ont accepté le traité anglo-irlandais forment les Forces de Défense irlandaises, une partie de l'organisation, refusant cet accord, se constitue en une nouvelle Irish Republican Army, illégale.\\n L'Irish Republican Army anti-traité apparaît entre avril et  du fait du refus du traité anglo-irlandais par une partie de l'Old IRA. Elle participe ainsi à la guerre civile irlandaise de  à . Elle maintient son activité dans les deux Irlandes (État libre d'Irlande, indépendant, et Irlande du Nord, britannique), mais concentre son action sur les intérêts britanniques, surtout en Irlande du Nord. En 1969 l'organisation se divise, donnant naissance à lOfficial Irish Republican Army et à la Provisional Irish Republican Army, minoritaire, moins socialiste et plus activiste.\\n LOfficial Irish Republican Army, proche de l'''Official Sinn Féin, plus socialiste et moins nationaliste que la Provisional Irish Republican Army, mène des campagnes d'attentats principalement entre 1969 et 1972 durant le conflit nord-irlandais, avant de décréter un cessez-le-feu.\\n La Provisional Irish Republican Army, minoritaire après la scission de 1969 (d'où son nom de provisional, «\\xa0provisoire\\xa0») devient rapidement grâce à son militantisme la principale organisation armée républicaine du conflit nord-irlandais. Le terme de provisional est d'ailleurs abandonné vers la fin des années 1970. 
Elle fut active de 1969 à 1997 (date du cessez-le-feu définitif), puis déposa définitivement les armes en 2005. Refusant le processus de paix, deux organisations scissionnèrent d'avec la PIRA : la Real Irish Republican Army et la Continuity Irish Republican Army.\\n La Continuity Irish Republican Army est issue d'une scission d'avec la Provisional Irish Republican Army dès 1986. Opposée à l'accord du Vendredi saint de 1997, elle continue son action armée jusqu'à aujourd'hui.\\n La Real Irish Republican Army est une scission opposée au processus de paix de la Provisional Irish Republican Army, apparue en 1997 et encore active aujourd'hui.\\n LIrish Republican Liberation Army naît en 2006 d'une scission de la Continuity Irish Republican Army''.\\n\\nGénéalogie de l'Irish Republican Army\"}"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "it = iter(fr['train'])\n",
    "for i in range(5):\n",
    "    next(it)\n",
    "query = next(it)\n",
    "query"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take the first line from the above text body:\n",
    "```\n",
    "L'Armée républicaine irlandaise (, IRA ; ) est le nom porté, depuis le début du , par plusieurs organisations paramilitaires luttant par les armes contre la présence britannique en Irlande du Nord.\n",
    "```\n",
    "This translates to the following in english\n",
    "```\n",
    "The Irish Republican Army (, IRA; ) is the name worn, since the beginning of the 19th century, by several paramilitary organizations fighting with arms against the British presence in Northern Ireland.\n",
    "```\n",
    "\n",
    "Let us now see what at the results that are semantically closer to this in our dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import getpass\n",
    "import lancedb\n",
    "\n",
    "if \"COHERE_API_KEY\" not in os.environ:\n",
    "    os.environ['COHERE_API_KEY'] = getpass.getpass(\"Enter your Cohere API key: \")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can now load the table even in a different session and anything ingest or search will be automatically vectorized. Let us now run the query."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "469 ms ± 39.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "db = lancedb.connect(\"~/lancedb\")\n",
    "tbl = db.open_table(\"wikipedia-cohere\") # We just open the existing\n",
    "rs = tbl.search(query[\"text\"]).limit(3).to_list()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " **TEXT id-french-12** \n",
      " L'Armée républicaine irlandaise (, IRA ; ) est le nom porté, depuis le début du , par plusieurs organisations paramilitaires luttant par les armes contre la présence britannique en Irlande du Nord. Les différents groupes se référent à eux comme Óglaigh na hÉireann (« volontaires d'Irlande »).\n",
      "\n",
      " L' appelée aussi Old IRA, issue de l'union en 1916 entre l' (proche du Parti travailliste irlandais) et les Irish Volunteers (alors généralement proches de l'IRB), est active entre  et , pendant la guerre d'indépendance irlandaise. Si ceux qui ont accepté le traité anglo-irlandais forment les Forces de Défense irlandaises, une partie de l'organisation, refusant cet accord, se constitue en une nouvelle Irish Republican Army, illégale.\n",
      " L'Irish Republican Army anti-traité apparaît entre avril et  du fait du refus du traité anglo-irlandais par une partie de l'Old IRA. Elle participe ainsi à la guerre civile irlandaise de  à . Elle maintient son activité dans les deux Irlandes (État libre d'Irlande, indépendant, et Irlande du Nord, britannique), mais concentre son action sur les intérêts britanniques, surtout en Irlande du Nord. En 1969 l'organisation se divise, donnant naissance à lOfficial Irish Republican Army et à la Provisional Irish Republican Army, minoritaire, moins socialiste et plus activiste.\n",
      " LOfficial Irish Republican Army, proche de l'''Official Sinn Féin, plus socialiste et moins nationaliste que la Provisional Irish Republican Army, mène des campagnes d'attentats principalement entre 1969 et 1972 durant le conflit nord-irlandais, avant de décréter un cessez-le-feu.\n",
      " La Provisional Irish Republican Army, minoritaire après la scission de 1969 (d'où son nom de provisional, « provisoire ») devient rapidement grâce à son militantisme la principale organisation armée républicaine du conflit nord-irlandais. Le terme de provisional est d'ailleurs abandonné vers la fin des années 1970. Elle fut active de 1969 à 1997 (date du cessez-le-feu définitif), puis déposa définitivement les armes en 2005. Refusant le processus de paix, deux organisations scissionnèrent d'avec la PIRA : la Real Irish Republican Army et la Continuity Irish Republican Army.\n",
      " La Continuity Irish Republican Army est issue d'une scission d'avec la Provisional Irish Republican Army dès 1986. Opposée à l'accord du Vendredi saint de 1997, elle continue son action armée jusqu'à aujourd'hui.\n",
      " La Real Irish Republican Army est une scission opposée au processus de paix de la Provisional Irish Republican Army, apparue en 1997 et encore active aujourd'hui.\n",
      " LIrish Republican Liberation Army naît en 2006 d'une scission de la Continuity Irish Republican Army''.\n",
      "\n",
      "Généalogie de l'Irish Republican Army \n",
      "\n",
      " **TEXT id-english-14732** \n",
      " The Irish Republican Army (IRA; ) was an Irish republican revolutionary paramilitary organisation. The ancestor of many groups also known as the Irish Republican Army, and distinguished from them as the \"Old IRA\", it was descended from the Irish Volunteers, an organisation established on 25 November 1913 that staged the Easter Rising in April 1916. In 1919, the Irish Republic that had been proclaimed during the Easter Rising was formally established by an elected assembly (Dáil Éireann), and the Irish Volunteers were recognised by Dáil Éireann as its legitimate army. Thereafter, the IRA waged a guerrilla campaign against the British occupation of Ireland in the 1919–1921 Irish War of Independence.\n",
      "\n",
      "Following the signing in 1921 of the Anglo-Irish Treaty, which ended the War of Independence, a split occurred within the IRA. Members who supported the treaty formed the nucleus of the Irish National Army. However, the majority of the IRA was opposed to the treaty. The anti-treaty IRA fought a civil war against the Free State Army in 1922–23, with the intention of creating a fully independent all-Ireland republic. Having lost the civil war, this group remained in existence, with the intention of overthrowing the governments of both the Irish Free State and Northern Ireland and achieving the Irish Republic proclaimed in 1916.\n",
      "\n",
      "Origins\n",
      "\n",
      "The Irish Volunteers, founded in 1913, staged the Easter Rising, which aimed at ending British rule in Ireland, in 1916. Following the suppression of the Rising, thousands of Volunteers were imprisoned or interned, leading to the break-up of the organisation. It was reorganised in 1917 following the release of first the internees and then the prisoners. At the army convention held in Dublin in October 1917, Éamon de Valera was elected president, Michael Collins Director for Organisation and Cathal Brugha Chairman of the Resident Executive, which in effect made him Chief of Staff.\n",
      "\n",
      "Following the success of Sinn Féin in the general election of 1918 and the setting up of the First Dáil (the legislature of the Irish Republic), Volunteers commenced military action against the Royal Irish Constabulary (RIC), the paramilitary police force in Ireland, and subsequently against the British Army. It began with the Soloheadbeg Ambush, when members of the Third Tipperary Brigade led by Séumas Robinson, Seán Treacy, Dan Breen and Seán Hogan, seized a quantity of gelignite, killing two RIC constables in the process.\n",
      "\n",
      "The Dáil leadership worried that the Volunteers would not accept its authority, given that, under their own constitution, they were bound to obey their own executive and no other body. In August 1919, Brugha proposed to the Dáil that the Volunteers be asked to swear allegiance to the Dáil, but one commentator states that another year passed before the movement took an oath of allegiance to the Irish Republic and its government in \"August 1920\". In sharp contrast, a contemporary in the struggle for Irish independence notes that by late 1919, the term \"Irish Republican Army (IRA)\" was replacing \"Volunteers\" in everyday usage. This change is attributed to the Volunteers, having accepted the authority of the Dáil, being referred to as the \"army of the Irish Republic\", popularly known as the \"Irish Republican Army\".\n",
      "\n",
      "A power struggle continued between Brugha and Collins, both cabinet ministers, over who had the greater influence. Brugha was nominally the superior as Minister for Defence, but Collins's power base came from his position as Director of Organisation of the IRA and from his membership on the Supreme Council of the Irish Republican Brotherhood (IRB). De Valera resented Collins's clear power and influence, which he saw as coming more from the secretive IRB than from his position as a Teachta Dála (TD) and minister in the Aireacht. Brugha and de Valera both urged the IRA to undertake larger, more conventional military actions for the propaganda effect but were ignored by Collins and Mulcahy. Brugha at one stage proposed the assassination of the entire British cabinet. This was also discounted due to its presumed negative effect on British public opinion. Moreover, many members of the Dáil, notably Arthur Griffith, did not approve of IRA violence and would have preferred a campaign of passive resistance to the British rule. The Dáil belatedly accepted responsibility for IRA actions in April 1921, just three months before the end of the Irish War of Independence.\n",
      "\n",
      "In practice, the IRA was commanded by Collins, with Richard Mulcahy as second in command. These men were able to issue orders and directives to IRA guerrilla units around the country and at times to send arms and organisers to specific areas. However, because of the localised and irregular character of the war, they were only able to exert limited control over local IRA commanders such as Tom Barry, Liam Lynch in Cork and Seán Mac Eoin in Longford.\n",
      "\n",
      "The IRA claimed a total strength of 70,000, but only about 3,000 were actively engaged in fighting against the Crown. The IRA distrusted those Irishmen who had fought in the British Army during the First World War as potential informers, but there were a number of exceptions such as Emmet Dalton, Tom Barry and Martin Doyle. The IRA divided its members into three classes, namely \"unreliable\", \"reliable\" and \"active\". The \"unreliable\" members were those who were nominally IRA members but did not do very much for the struggle, \"reliable\" members played a supporting role in the war while occasionally fighting and the \"active\" men those who were engaged in full-time fighting. Of the IRA brigades only about one to two-thirds were considered to be \"reliable\" while those considered \"active\" were even smaller. A disproportionate number of the \"active\" IRA men were teachers, medical students, shoemakers and bootmakers; those engaged in building trades like painters, carpenters and bricklayers; draper's assistants and creamery workers. The Canadian historian Peter Hart wrote \"...the guerrillas were disproportionately skilled, trained and urban\". Farmers and fishermen tended to be underrepresented in the IRA. Those Irishmen engaged in white-collar trades or working as skilled labourers were much more likely to be involved in cultural nationalist groups like the Gaelic League than farmers or fishermen, and thus to have a stronger sense of Irish nationalism. Furthermore, the authority of the Crown tended to be stronger in towns and cities than in the countryside. Thus, those engaged in Irish nationalist activities in urban areas were much more likely to come into conflict with the Crown, leading to a greater chance of radicalisation. 
Finally, the British tactic of blowing up the homes of IRA members had the effect of discouraging many farmers from joining the struggle as the destruction of the family farm could easily reduce a farmer and his family to destitution. Of the \"active\" IRA members, three-quarters were in their late teens or early 20s and only 5% of the \"active\" men were in the age range of 40 or older. The \"active\" members were overwhelmingly single men with only 4% being married or engaged in a relationship. The life of an \"active\" IRA man with its stress of living on the run and constantly being in hiding tended to attract single men who could adjust to this lifestyle far more easily than a man in a relationship. Furthermore, the IRA preferred to recruit single men as it was found that singles could devote themselves more wholeheartedly to the struggle.\n",
      "\n",
      "Women were active in the republican movement, but almost no women fought with the IRA whose \"active\" members were almost entirely male. The IRA was not a sectarian group and went out of its way to proclaim it was open to all Irishmen, but its membership was largely Catholic with virtually no Protestants serving as \"active\" IRA men. Hart wrote that in his study of the IRA membership that he found only three Protestants serving as \"active\" IRA men between 1919 and 1921. Of the 917 IRA men convicted by British courts under the Defence of the Realm Act in 1919, only one was a Protestant. The majority of those serving in the IRA were practising Catholics, but there was a large minority of \"pagans\" as atheists or non-practising Catholics were known in Ireland. The majority of the IRA men serving in metropolitan Britain were permanent residents with very few sent over from Ireland. The majority of the IRA men operating in Britain were Irish-born, but there a substantial minority who were British-born, something that made them especially insistent on asserting their Irish identity.\n",
      "\n",
      "Irish War of Independence\n",
      "\n",
      "IRA campaign and organisation\n",
      "\n",
      "The IRA fought a guerrilla war against the Crown forces in Ireland from 1919 to July 1921. The most intense period of the war was from November 1920 onwards. The IRA campaign can broadly be split into three phases. The first, in 1919, involved the re-organisation of the Irish Volunteers as a guerrilla army and only sporadic attacks. Organisers such as Ernie O'Malley were sent around the country to set up viable guerrilla units. On paper, there were 100,000 or so Volunteers enrolled after the conscription crisis of 1918. However, only about 15,000 of these participated in the guerrilla war. In 1919, Collins, the IRA's Director of Intelligence, organised the \"Squad\"—an assassination unit based in Dublin which killed police involved in intelligence work (the Irish playwright Brendan Behan's father Stephen Behan was a member of the Squad). Typical of Collins's sardonic sense of humour, the Squad was often referred to as his \"Twelve Apostles\". In addition, there were some arms raids on RIC barracks. By the end of 1919, four Dublin Metropolitan Police and 11 RIC men had been killed. The RIC abandoned most of their smaller rural barracks in late 1919. Around 400 of these were burned in a co-ordinated IRA operation around the country in April 1920.\n",
      "\n",
      "The second phase of the IRA campaign, roughly from January to July 1920, involved attacks on the fortified police barracks located in the towns. Between January and June 1920, 16 of these were destroyed and 29 badly damaged. Several events of late 1920 greatly escalated the conflict. Firstly, the British declared martial law in parts of the country—allowing for internment and executions of IRA men. Secondly they deployed paramilitary forces, the Black and Tans and Auxiliary Division, and more British Army personnel into the country. Thus, the third phase of the war (roughly August 1920 – July 1921) involved the IRA taking on a greatly expanded British force, moving away from attacking well-defended barracks and instead using ambush tactics. To this end the IRA was re-organised into \"flying columns\"—permanent guerrilla units, usually about 20 strong, although sometimes larger. In rural areas, the flying columns usually had bases in remote mountainous areas.\n",
      "\n",
      "The most high-profile violence of the war took place in Dublin in November 1920 and is still known as Bloody Sunday. In the early hours of the morning, Collins' \"Squad\" killed fourteen British spies. In reprisal, that afternoon, British forces opened fire on a football crowd at Croke Park, killing 14 civilians. Towards the end of the day, two prominent Republicans and a friend of theirs were arrested and killed by Crown Forces.\n",
      "\n",
      "While most areas of the country saw some violence in 1919–1921, the brunt of the war was fought in Dublin and the southern province of Munster. In Munster, the IRA carried out a significant number of successful actions against British troops, for instance, the ambushing and killing of 16 of 18 Auxiliaries by Tom Barry's column at Kilmicheal in West Cork in November 1920, or Liam Lynch's men killing 13 British soldiers near Millstreet early in the next year. At the Crossbarry Ambush in March 1921, 100 or so of Barry's men fought a sizeable engagement with a British column of 1,200, escaping from the British encircling manoeuvre. In Dublin, the \"Squad\" and elements of the IRA Dublin Brigade were amalgamated into the \"Active Service Unit\", under Oscar Traynor, which tried to carry out at least three attacks on British troops a day. Usually, these consisted of shooting or grenade attacks on British patrols. Outside Dublin and Munster, there were only isolated areas of intense activity. For instance, the County Longford IRA under Seán Mac Eoin carried out a number of well-planned ambushes and successfully defended the village of Ballinalee against Black and Tan reprisals in a three-hour gun battle. In County Mayo, large-scale guerrilla action did not break out until spring 1921, when two British forces were ambushed at Carrowkennedy and Tourmakeady. Elsewhere, fighting was more sporadic and less intense.\n",
      "\n",
      "In Belfast, the war had a character all of its own. The city had a Protestant and unionist majority and IRA actions were responded to with reprisals against the Catholic population, including killings (such as the McMahon killings) and the burning of many homes – as on Belfast's Bloody Sunday. The IRA in Belfast and the North generally, although involved in protecting the Catholic community from loyalists and state forces, undertook a retaliatory arson campaign against factories and commercial premises. The violence in Belfast alone, which continued until October 1922 (long after the truce in the rest of the country), claimed the lives of between 400 and 500 people.\n",
      "\n",
      "In April 1921, the IRA was again reorganised, in line with the Dáil's endorsement of its actions, along the lines of a regular army. Divisions were created based on region, with commanders being given responsibility, in theory, for large geographical areas. In practice, this had little effect on the localised nature of the guerrilla warfare.\n",
      "\n",
      "In May 1921, the IRA in Dublin attacked and burned the Custom House. The action was a serious setback as five members were killed and eighty captured.\n",
      "\n",
      "By the end of the war in July 1921, the IRA was hard-pressed by the deployment of more British troops into the most active areas and a chronic shortage of arms and ammunition. It has been estimated that the IRA had only about 3,000 rifles (mostly captured from the British) during the war, with a larger number of shotguns and pistols. An ambitious plan to buy arms from Italy in 1921 collapsed when the money did not reach the arms dealers. Towards the end of the war, some Thompson submachine guns were imported from the United States; however 450 of these were intercepted by the American authorities and the remainder only reached Ireland shortly before the Truce.\n",
      "\n",
      "By June 1921, Collins' assessment was that the IRA was within weeks, possibly even days, of collapse. It had few weapons or ammunition left. Moreover, almost 5,000 IRA men had been imprisoned or interned and over 500 killed. Collins and Mulcahy estimated that the number of effective guerrilla fighters was down to 2,000–3,000. However, in the summer of 1921, the war was abruptly ended.\n",
      "\n",
      "The British recruited hundreds of World War I veterans into the RIC and sent them to Ireland. Because there was initially a shortage of RIC uniforms, the veterans at first wore a combination of dark green RIC uniforms and khaki British Army uniforms, which inspired the nickname \"Black and Tans\". The brutality of the Black and Tans is now well-known, although the greatest violence attributed to the Crown's forces was often that of the Auxiliary Division of the Constabulary. One of the strongest critics of the Black and Tans was King George V who in May 1921 told Lady Margery Greenwood that \"he hated the idea of the Black and Tans.\"\n",
      "\n",
      "The IRA was also involved in the destruction of many stately homes in Munster. The Church of Ireland Gazette recorded numerous instances of Unionists and Loyalists being shot, burnt or forced from their homes during the early 1920s. In County Cork between 1920 and 1923 the IRA shot over 200 civilians of whom over 70 (or 36%) were Protestants: five times the percentage of Protestants in the civilian population. This was due to the historical inclination of Protestants towards loyalty to the United Kingdom. A convention of Irish Protestant Churches in Dublin in May 1922 signed a resolution placing \"on record\" that \"hostility to Protestants by reason of their religion has been almost, if not wholly, unknown in the twenty-six counties in which Protestants are in the minority.\"\n",
      "\n",
      "Many historic buildings in Ireland were destroyed during the war, most famously the Custom House in Dublin, which was disastrously attacked on de Valera's insistence, to the horror of the more militarily experienced Collins. As he feared, the destruction proved a pyrrhic victory for the Republic, with so many IRA men killed or captured that the IRA in Dublin suffered a severe blow.\n",
      "\n",
      "This was also a period of social upheaval in Ireland, with frequent strikes as well as other manifestations of class conflict. In this regard, the IRA acted to a large degree as an agent of social control and stability, driven by the need to preserve cross-class unity in the national struggle, and on occasion being used to break strikes.\n",
      "\n",
      "Assessments of the effectiveness of the IRA's campaign vary. They were never in a position to engage in conventional warfare. The political, military and financial costs of remaining in Ireland were higher than the British government was prepared to pay and this in a sense forced them into negotiations with the Irish political leaders. According to historian Michael Hopkinson, the guerrilla warfare \"was often courageous and effective\". Historian David Fitzpatrick observes, \"The guerrilla fighters...were vastly outnumbered by the forces of the Crown... The success of the Irish Volunteers in surviving so long is therefore noteworthy.\"\n",
      "\n",
      "Truce and treaty\n",
      "\n",
      "David Lloyd George, the British Prime Minister, at the time, found himself under increasing pressure (both internationally and from within the British Isles) to try to salvage something from the situation. This was a complete reversal on his earlier position. He had consistently referred to the IRA as a \"murder gang\" up until then. An unexpected olive branch came from King George V, who, in a speech in Belfast called for reconciliation on all sides, changed the mood and enabled the British and Irish Republican governments to agree to a truce. The Truce was agreed on 11 July 1921. On 8 July, de Valera met General Nevil Macready, the British commander in chief in Ireland and agreed terms. The IRA was to retain its arms and the British Army was to remain in barracks for the duration of peace negotiations. Many IRA officers interpreted the truce only as a temporary break in fighting. They continued to recruit and train volunteers, with the result that the IRA had increased its number to over 72,000 men by early 1922.\n",
      "\n",
      "Negotiations on an Anglo-Irish Treaty took place in late 1921 in London. The Irish delegation was led by Arthur Griffith and Michael Collins.\n",
      "\n",
      "The most contentious areas of the Treaty for the IRA were abolition of the Irish Republic declared in 1919, the status of the Irish Free State as a dominion in the British Commonwealth and the British retention of the so-called Treaty Ports on Ireland's south coast. These issues were the cause of a split in the IRA and ultimately, the Irish Civil War.\n",
      "\n",
      "Under the Government of Ireland Act 1920, Ireland was partitioned, creating Northern Ireland and Southern Ireland. Under the terms of the Anglo-Irish agreement of 6 December 1921, which ended the war (1919–21), Northern Ireland was given the option of withdrawing from the new state, the Irish Free State, and remaining part of the United Kingdom. The Northern Ireland parliament chose to do that. An Irish Boundary Commission was then set up to review the border.\n",
      "\n",
      "Irish leaders expected that it would so reduce Northern Ireland's size, by transferring nationalist areas to the Irish Free State, as to make it economically unviable. Partition was not by itself the key breaking point between pro- and anti-Treaty campaigners; both sides expected the Boundary Commission to greatly reduce Northern Ireland. Moreover, Michael Collins was planning a clandestine guerrilla campaign against the Northern state using the IRA. In early 1922, he sent IRA units to the border areas and sent arms to northern units. It was only afterwards, when partition was confirmed, that a united Ireland became the preserve of anti-Treaty Republicans.\n",
      "\n",
      "IRA and the Anglo-Irish Treaty\n",
      "\n",
      "The IRA leadership was deeply divided over the decision by the Dáil to ratify the Treaty. Despite the fact that Michael Collins – the de facto leader of the IRA – had negotiated the Treaty, many IRA officers were against it. Of the General Headquarters (GHQ) staff, nine members were in favour of the Treaty while four opposed it. The majority of the IRA rank-and-file were against the Treaty; in January–June 1922, their discontent developed into open defiance of the elected civilian Provisional government of Ireland.\n",
      "\n",
      "Both sides agreed that the IRA's allegiance was to the (elected) Dáil of the Irish Republic, but the anti-Treaty side argued that the decision of the Dáil to accept the Treaty (and set aside the Irish Republic) meant that the IRA no longer owed that body its allegiance. They called for the IRA to withdraw from the authority of the Dáil and to entrust the IRA Executive with control over the army. On 16 January, the first IRA division – the 2nd Southern Division led by Ernie O'Malley – repudiated the authority of the GHQ. A month later, on 18 February, Liam Forde, O/C of the IRA Mid-Limerick Brigade, issued a proclamation stating that: \"We no longer recognise the authority of the present head of the army, and renew our allegiance to the existing Irish Republic\". This was the first unit of the IRA to break with the pro-Treaty government.\n",
      "\n",
      "On 22 March, Rory O'Connor held what was to become an infamous press conference and declared that the IRA would no longer obey the Dáil as (he said) it had violated its Oath to uphold the Irish Republic. He went on to say that \"we repudiate the Dáil ... We will set up an Executive which will issue orders to the IRA all over the country.\" In reply to the question on whether this meant they intended to create a military dictatorship, O'Connor said: \"You can take it that way if you like.\"\n",
      "\n",
      "On 28 March, the (anti-Treaty) IRA Executive issued statement stating that Minister of Defence (Richard Mulcahy) and the Chief-of-Staff (Eoin O'Duffy) no longer exercised any control over the IRA. In addition, it ordered an end to the recruitment to the new military and police forces of the Provisional Government. Furthermore, it instructed all IRA units to reaffirm their allegiance to the Irish Republic on 2 April.\n",
      "The stage was set for civil war over the Treaty.\n",
      "\n",
      "Civil War\n",
      "\n",
      "The pro-treaty IRA soon became the nucleus of the new (regular) Irish National Army created by Collins and Richard Mulcahy. British pressure, and tensions between the pro- and anti-Treaty factions of the IRA, led to a bloody civil war, ending in the defeat of the anti-Treaty faction. On 24 May 1923, Frank Aiken, the (anti-treaty) IRA Chief-of-Staff, called a cease-fire. Many left political activity altogether, but a minority continued to insist that the new Irish Free State, created by the \"illegitimate\" Treaty, was an illegitimate state.  They asserted that their \"IRA Army Executive\" was the real government of a still-existing Irish Republic. The IRA of the Civil War and subsequent organisations that have used the name claim lineage from that group, which is covered in full at Irish Republican Army (1922–1969).\n",
      "\n",
      "For information on later organisations using the name Irish Republican Army, see the table below. For a genealogy of organisations using the name IRA after 1922, see List of organisations known as the Irish Republican Army.\n",
      "\n",
      "See also\n",
      "List of films featuring the Irish Republican Army\n",
      "\n",
      "References\n",
      "\n",
      "Bibliography\n",
      "\n",
      "Further reading\n",
      "\n",
      "External links\n",
      "\n",
      "Bureau of Military History, 1913-1921 at militaryarchives.ie\n",
      "Irish Volunteers History, 1913-1922 at IVCO\n",
      "\n",
      " \n",
      "Institutions of the Irish Republic (1919–1922)\n",
      "Guerrilla organizations\n",
      "Irish republican militant groups\n",
      "National liberation armies\n",
      "Anti-imperialism in Europe \n",
      "\n",
      " **TEXT id-english-5859** \n",
      " The Continuity Irish Republican Army (Continuity IRA or CIRA), styling itself as the Irish Republican Army (), is an Irish republican paramilitary group that aims to bring about a united Ireland. It claims to be a direct continuation of the original Irish Republican Army and the national army of the Irish Republic that was proclaimed in 1916. It emerged from a split in the Provisional IRA in 1986 but did not become active until the Provisional IRA ceasefire of 1994. It is an illegal organisation in the Republic of Ireland and is designated a terrorist organisation in the United Kingdom, New Zealand and the United States. It has links with the political party Republican Sinn Féin (RSF).\n",
      "\n",
      "Since 1994, the CIRA has waged a campaign in Northern Ireland against the British Army and the Police Service of Northern Ireland (PSNI), formerly the Royal Ulster Constabulary. This is part of a wider campaign against the British security forces by dissident republican paramilitaries. It has targeted the security forces in gun attacks and bombings, as well as with grenades, mortars and rockets. The CIRA has also carried out bombings with the goal of causing economic harm and/or disruption, as well as many punishment attacks on alleged criminals.\n",
      "\n",
      "To date, it has been responsible for the death of one PSNI officer. The CIRA is smaller and less active than the Real IRA, and there have been a number of splits within the organisation since the mid-2000s.\n",
      "\n",
      "Origins\n",
      "The Continuity IRA has its origins in a split in the Provisional IRA. In September 1986, the Provisional IRA held a General Army Convention (GAC), the organisation's supreme decision-making body. It was the first GAC in 16 years. The meeting, which like all such meetings was secret, was convened to discuss among other resolutions, the articles of the Provisional IRA constitution which dealt with abstentionism, specifically its opposition to the taking of seats in Dáil Éireann (the parliament of the Republic of Ireland). The GAC passed motions (by the necessary two-thirds majority) allowing members of the Provisional IRA to discuss and debate the taking of parliamentary seats, and the removal of the ban on members of the organisation from supporting any successful republican candidate who took their seat in Dáil Éireann.\n",
      "\n",
      "The Provisional IRA convention delegates opposed to the change in the constitution claimed that the convention was gerrymandered \"by the creation of new IRA organisational structures for the convention, including the combinations of Sligo-Roscommon-Longford and Wicklow-Wexford-Waterford.\" The only IRA body that supported this viewpoint was the outgoing IRA Executive. Those members of the outgoing Executive who opposed the change comprised a quorum. They met, dismissed those in favour of the change, and set up a new Executive. They contacted Tom Maguire, who was a commander in the old IRA and had supported the Provisionals against the Official IRA (see Irish republican legitimatism), and asked him for support. Maguire had also been contacted by supporters of Gerry Adams, then president of Sinn Féin, and a supporter of the change in the Provisional IRA constitution.\n",
      "\n",
      "Maguire rejected Adams' supporters, supported the IRA Executive members opposed to the change, and named the new organisers the Continuity Army Council. In a 1986 statement, he rejected \"the legitimacy of an Army Council styling itself the Council of the Irish Republican Army which lends support to any person or organisation styling itself as Sinn Féin and prepared to enter the partition parliament of Leinster House.\" In 1987, Maguire described the \"Continuity Executive\" as the \"lawful Executive of the Irish Republican Army.\"\n",
      "\n",
      "Campaign\n",
      "\n",
      "Initially, the Continuity IRA did not reveal its existence, either in the form of press statements or paramilitary activity. Although the Garda Síochána had suspicions that the organisation existed, they were unsure of its name, labelling it the \"Irish National Republican Army\". On 21 January 1994, on the 75th anniversary of the First Dáil Éireann, Continuity IRA volunteers offered a \"final salute\" to Tom Maguire by firing over his grave, and a public statement and a photo were published in Saoirse Irish Freedom. In February 1994 it was reported that in previous months Gardaí had found arms dumps along the Cooley Peninsula in County Louth that did not belong to the Provisional IRA, and forensics tests determined had been used for firing practice recently.\n",
      "\n",
      "It was only after the Provisional IRA declared a ceasefire in 1994 that the Continuity IRA became active, announcing its intention to continue the campaign against British rule. The CIRA continues to oppose the Good Friday Agreement and, unlike the Provisional IRA (and the Real IRA in 1998), the CIRA has not announced a ceasefire or agreed to participate in weapons decommissioning—nor is there any evidence that it will. In the 18th Independent Monitoring Commission's report, the RIRA, the CIRA and the Irish National Liberation Army (INLA) were deemed a potential future threat. The CIRA was labelled \"active, dangerous and committed and... capable of a greater level of violent and other crime\". Like the RIRA and RIRA splinter group Óglaigh na hÉireann, it too sought funds for expansion. It is also known to have worked with the INLA.\n",
      "\n",
      "The CIRA has been involved in a number of bombing and shooting incidents. Targets of the CIRA have included the British military, the Northern Ireland police (both the Royal Ulster Constabulary and its successor the Police Service of Northern Ireland). Since the Good Friday Agreement in 1998 the CIRA, along with other paramilitaries opposing the ceasefire, have been involved with a countless number of punishment shootings and beatings. By 2005 the CIRA was believed to be an established presence on the island of Great Britain with the capability of launching attacks. A bomb defused in Dublin in December 2005 was believed to have been the work of the CIRA. In February 2006, the Independent Monitoring Commission (IMC) blamed the CIRA for planting four bombs in Northern Ireland during the final quarter of 2005, as well as several hoax bomb warnings. The IMC also blamed the CIRA for the killings of two former CIRA members in Belfast, who had stolen CIRA weapons and established a rival organisation.\n",
      "\n",
      "The CIRA continued to be active in both planning and undertaking attacks on the PSNI. The IMC said they tried to lure police into ambushes, while they have also taken to stoning and using petrol bombs. In addition, other assaults, robbery, tiger kidnapping, extortion, fuel laundering and smuggling were undertaken by the group. The CIRA also actively took part in recruiting and training members, including disgruntled former Provisional IRA members. As a result of this continued activity the IMC said the group remained \"a very serious threat\".\n",
      "\n",
      "On 10 March 2009 the CIRA claimed responsibility for the fatal shooting of a PSNI officer in Craigavon, County Armagh—the first police fatality in Northern Ireland since 1998. The officer was fatally shot by a sniper as he and a colleague investigated \"suspicious activity\" at a house nearby when a window was smashed by youths causing the occupant to phone the police. The PSNI officers responded to the emergency call, giving a CIRA sniper the chance to shoot and kill officer Stephen Carroll. Carroll was killed two days after the Real IRA's 2009 Massereene Barracks shooting at Massereene Barracks in Antrim. In a press interview with Republican Sinn Féin some days later, regarded by some to be the political wing of the Continuity IRA, Richard Walsh described the attacks as \"acts of war\".\n",
      "\n",
      "In 2013, the Continuity IRA's 'South Down Brigade' threatened a Traveller family in Newry and published a statement in the local newspaper. There were negotiations with community representatives and the CIRA announced the threat was lifted. It was believed the threat was issued after a Traveller feud which resulted in a pipe bomb attack in Bessbrook, near Newry. The Continuity IRA is believed to be strongest in the County Fermanagh – North County Armagh area (Craigavon, Armagh and Lurgan). It is believed to be behind a number of attacks such as pipe bombings, rocket attacks, gun attacks, and the PSNI claimed it orchestrated riots a number of times to lure police officers into areas such as Kilwilkie in Lurgan and Drumbeg in Craigavon in order to attack them. It also claimed the group orchestrated a riot during a security alert in Lurgan. The alert turned out to be a hoax.\n",
      "\n",
      "On Easter 2016, the Continuity IRA marched in paramilitary uniforms through North Lurgan, Co Armagh, without any hindrance from the PSNI who monitored the parade from a police helicopter.\n",
      "\n",
      "In July and August 2019 the CIRA carried out attempted bomb attacks on the PSNI in Craigavon, County Armagh and Wattlebridge, County Fermanagh.\n",
      "\n",
      "On 5 February 2020, a bomb planted by the CIRA was found by the PSNI in a lorry in Lurgan. The CIRA believed the lorry was going to be put on a North Channel ferry to Scotland in January 2020.\n",
      "\n",
      "Claim to legitimacy\n",
      " Similar to the claim put forward by the Provisional IRA after its split from the Official IRA in 1969, the Continuity IRA claims to be the legitimate continuation of the original Irish Republican Army or Óglaigh na hÉireann. This argument is based on the view that the surviving anti-Treaty members of the Second Dáil delegated their \"authority\" to the IRA Army Council in 1938. As further justification for this claim, Tom Maguire, one of those anti-Treaty members of the Second Dáil, issued a statement in favour of the Continuity IRA, just as he had done in 1969 in favour of the Provisionals. J. Bowyer Bell, in his The Irish Troubles, describes Maguire's opinion in 1986: \"abstentionism was a basic tenet of republicanism, a moral issue of principle. Abstentionism gave the movement legitimacy, the right to wage war, to speak for a Republic all but established in the hearts of the people\". Maguire's stature was such that a delegation from Gerry Adams sought his support in 1986, but was rejected.<ref>Robert W. White, Ruairí Ó Brádaigh, The Life and Politics of an Irish Revolutionary, 2006, p. 310.</ref>\n",
      "\n",
      "Relationship to other organisations\n",
      "These changes within the IRA were accompanied by changes on the political side and at the 1986 Sinn Féin Ard Fheis (party conference), which followed the IRA Convention, the party's policy of abstentionism, which forbade Sinn Féin elected representatives from taking seats in the Oireachtas, the parliament of the Republic, was dropped. On 2 November, the 628 delegates present cast their votes, the result being 429 to 161. The traditionalists, having lost at both conventions, walked out of the Mansion House, met that evening at the West County Hotel, and reformed as Republican Sinn Féin (RSF).\n",
      "\n",
      "According to a report in the Cork Examiner, the Continuity IRA's first chief of staff was Dáithí Ó Conaill, who also served as the first chairman of RSF from 1986 to 1987. The Continuity IRA and RSF perceive themselves as forming a \"true\" Republican Movement.\n",
      "\n",
      "Structure and status\n",
      "The leadership of the Continuity IRA is believed to be based in the provinces of Munster and Ulster. It was alleged that its chief of staff was a Limerick man and that a number of other key members were from that county, until their expulsion. Dáithí Ó Conaill was the first chief of staff until 1991. In 2004 the United States (US) government believed the Continuity IRA consisted of fewer than fifty hardcore activists. In 2005, Irish Minister for Justice, Equality and Law Reform Michael McDowell told Dáil Éireann that the organisation had a maximum of 150 members.\n",
      "\n",
      "The CIRA is an illegal organisation under UK (section 11(1) of the Terrorism Act 2000) and ROI law due to the use of 'IRA' in the group's name, in a situation analogous to that of the Real Irish Republican Army (RIRA). Membership of the organisation is punishable by a sentence of up to ten years imprisonment under UK law. On 31 May 2001 Dermot Gannon became the first person to be convicted of membership of the CIRA solely on the word of a Garda Síochána chief superintendent. On 13 July 2004, the US government designated the CIRA as a 'Foreign Terrorist Organization'. This made it illegal for Americans to provide material support to the CIRA, requires US financial institutions to block the group's assets and denies alleged CIRA members visas into the US.\n",
      "\n",
      "External aid and arsenal\n",
      "The US government suspects the Continuity IRA of having received funds and arms from supporters in the United States. Security sources in Ireland have expressed the suspicion that, in co-operation with the RIRA, the Continuity IRA may have acquired arms and materiel from the Balkans. They also suspect that the Continuity IRA arsenal contains some weapons that were taken from Provisional IRA arms dumps, including a few dozen rifles, machine guns, and pistols; a small amount of the explosive Semtex; and a few dozen detonators.\n",
      "\n",
      "Internal tension and splits\n",
      "In 2005, several members of the CIRA, who were serving prison sentences in Portlaoise Prison for paramilitary activity, left the organisation. Some transferred to the INLA landing of the prison, but the majority of those who left are now independent and on E4 landing. The remaining CIRA prisoners have moved to D Wing. Supporters of the Continuity IRA leadership claim that this resulted from an internal disagreement, which although brought to a conclusion, was followed by some people leaving the organisation anyway. Supporters of the disaffected members established the Concerned Group for Republican Prisoners. Most of those who had left went back to the CIRA, or dissociated themselves from the CGRP, which is now defunct.\n",
      "\n",
      "In February 2006, the Independent Monitoring Commission claimed in a report on paramilitary activity that two groups, styling themselves as \"Óglaigh na hÉireann\" and \"Saoirse na hÉireann\", had been formed after a split in the Continuity IRA either in early 2006 or late 2005. The Óglaigh na hÉireann group was responsible for a number of pipe bomb attacks on the PSNI, bomb hoaxes, and robberies, the IMC also claimed the organisation was responsible for the killing of Andrew Burns on 12 February 2008 and was seeking to recruit former members of the RIRA. The Saoirse na hÉireann (SNH) group was composed of \"disaffected and largely young republicans\" and was responsible for a number of bomb hoaxes, two of which took place in September 2006. It was thought to have operated largely in republican areas of Belfast . The groups had apparently ceased operations by early 2009.\n",
      "\n",
      "In 2007, the Continuity IRA was responsible for shooting dead two of its members who had left and attempted to create their own organisation. Upon leaving the CIRA, they had allegedly taken a number of guns with them. The Continuity IRA is believed by Gardaí to have been involved in a number of gangland killings in Dublin and Limerick.\n",
      "\n",
      "In July 2010, members of a \"militant Northern-based faction within the CIRA\" led by a well-known member from south Londonderry claimed to have overthrown the leadership of the organisation. They also claimed that an Army Convention representing \"95 per cent of volunteers\" had unanimously elected a new 12-member Army Executive, which in turn appointed a new seven-member Army Council. The moves came as a result of dissatisfication with the southern-based leadership and the apparent winding-down of military operations. A senior source from RSF said: \"We would see them [the purported new leadership] as just another splinter group that has broken away.\" This organisation is referred to as the Real CIRA.\n",
      "\n",
      "In June 2011 CIRA member Liam Kenny was murdered, allegedly by drug dealers, at his home in Clondalkin, West Dublin. On 28 November 2011 an innocent man was mistakenly shot dead in retaliation for the murder of Liam Kenny. Limerick Real IRA volunteer Rose Lynch pleaded guilty to this murder at the Special Criminal Court and was sentenced to life imprisonment.\n",
      "\n",
      "In July 2012 the CIRA announced it had a new leadership after expelling members who had been working against the organisation.\n",
      "\n",
      "In April 2014 a former leading member of the Belfast Continuity IRA who had been expelled from the organisation, Tommy Crossan, was shot dead.\n",
      "\n",
      "In popular culture\n",
      "The CIRA are depicted in RTÉ's TV series crime drama Love/Hate''.\n",
      "\n",
      "Notes\n",
      "\n",
      "References\n",
      "\n",
      " \n",
      "Irish republican militant groups\n",
      "Organised crime groups in Ireland\n",
      "1986 establishments in Ireland \n",
      "\n"
     ]
    }
   ],
   "source": [
    "for r in rs:\n",
    "    print(f\" **TEXT id-{r['id']}** \\n {r['text']} \\n\")\n",
    "#"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see in the above result, the closest match is the text itself that we used to search. The second closest match in an English text with a similar semantic meaning referring to IRA. This is what a multi-lingual embedding model can do.\n",
    "\n",
    "Find more examples on [VectorDB-recipes](https://github.com/lancedb/vectordb-recipes) repo"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
